Abstract. Finding the maximum clique is a known NP-Complete problem
and it is also hard to approximate. This work proposes two efficient algorithms
to obtain it. The first one is able to find the maximum for some
special cases, while the second one has its execution time bounded by the
number of cliques that each vertex belongs to.

1. Introduction 

Finding cliques in graphs is a well known problem; in particular, finding the
maximum clique was shown to be an NP-Complete problem [Karp, 1972]. Indeed, from
any vertex, to discover the maximum clique to which it belongs, we should take all
combinations of k neighbors and verify whether they are mutually adjacent, which
yields an exponential time as a function of the vertex degree. Moreover,
[Johnson, 1973] shows that there exists no sublinear approximation algorithm. This
problem has been largely treated and an extensive survey can be found in
[Bomze et al., 1999]. In this work, we present two algorithms to find maximum
cliques. We decompose the graph vertex by vertex until no vertices remain; then,
we re-build the graph restoring each of the vertices, one by one, in inverted
order, computing maximal cliques at each step. The next section is devoted to
presenting the algorithms and the theorems showing their correctness. A section
showing the algorithms applied to real graphs is presented to illustrate how they
work. The paper is concluded with a discussion about the complexity of the problem.

2. Algorithms 

Let $G = (V, E)$ be a simple undirected graph with $n = |V|$ vertices and
$m = |E| < |V| \times |V|$ edges. The neighborhood of a vertex is the set composed
of the vertices directly connected to it, i.e., $w \in N(v)$ such that
$\{v, w\} \in E$. Then, the degree of vertex $v$ is denoted as $d(v) = |N(v)|$.
Let us call $H = (V, E, A)$ the annotated graph $G$, where $|A| = |V|$ such that
if $v \in V$ then $a_v \in A$ is a list of attributes of vertex $v$.

Attributes are lists of sets, each one a clique, and they are computed by
the proposed algorithms. For Algorithm 2 we denote an element of the list as a
set $L \in a_v$, its initialization with set $L$ as $\{L\}$, the append of set $L$
as $a_v \leftarrow \{a_v, L\}$, and the elimination of a set $L$ as
$\{a_v \setminus L\}$. The elements of this list can also be interpreted as
$a_v^i$ for $i \in N(v)$ (see Algorithm 1). A list of maximal cliques is stated by
Definition 1.

Definition 1. Given a vertex $v$ and the list $l_v$ whose elements are sets $S_i$
having $v$, i.e., $S_i \supset v$. These sets $S_i \in l_v$ are maximal if for all
$i \neq j$ they verify the following properties:
(i) $v \subseteq S_i \cap S_j$

(ii) $S_i \not\subset S_j$

(iii) $S_i \not\supset S_j$
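The three properties of Definition 1 can be checked mechanically. The following is a minimal sketch, not part of the original implementation: the function name `is_maximal_list` is hypothetical, and it assumes that the non-inclusion conditions also forbid two equal sets in the same list.

```python
def is_maximal_list(v, sets):
    """Check Definition 1 for a list of vertex sets attached to vertex v.

    Assumption (not stated explicitly in the text): duplicated sets are
    rejected too, i.e. non-inclusion is tested with "<=".
    """
    sets = [frozenset(s) for s in sets]
    for i, s_i in enumerate(sets):
        # Property (i): every set must contain v.
        if v not in s_i:
            return False
        for j, s_j in enumerate(sets):
            # Properties (ii) and (iii): no set is included in another one.
            # Testing every ordered pair (i, j) covers both directions.
            if i != j and s_i <= s_j:
                return False
    return True
```

For instance, the list `[{1, 2, 3}, {1, 4}]` passes the check for vertex 1, while `[{1, 2, 3}, {1, 2}]` does not, because the second set is included in the first one.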

Therefore, lists $a_v$ have the properties presented in the following Definition 2.

Definition 2. Let $v \in V$ be a vertex; then $a_v$ is a list composed of sets of
vertices $A_i$, such that the following properties hold:

(i) each set $A_i \in a_v$ denotes a maximal clique having $v$ (see Definition 1);
(ii) the maximum clique $K_{\max}(G)$ of the graph $G$ is found as

(1) $K_{\max}(G) = A$ if $|A| = \max(|A_i| : A_i \in a_v, i \in [1, |a_v|])$,

where $|a_v|$ refers to the length of the list.
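Equation 1 amounts to scanning the stored sets and keeping the largest one. A minimal sketch follows, assuming the attributes are held in a dictionary that maps each vertex to its list of sets and that the scan runs over all vertices; the name `maximum_clique` and this data layout are illustrative choices, not taken from the original implementation.

```python
def maximum_clique(attributes):
    """Return a largest set among all lists a_v (Equation 1).

    attributes: dict mapping a vertex v to its list a_v of vertex sets.
    Ties are broken by the first set encountered.
    """
    best = frozenset()
    for sets in attributes.values():
        for clique in sets:
            if len(clique) > len(best):
                best = frozenset(clique)
    return best
```

The cost of this scan is proportional to the total number of stored sets, which is why the length of the lists matters in the complexity analysis.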

Notice, firstly, that each set $A_i$ is maximal in the sense that there does not
exist another set $B \supset A_i$; secondly, the sets $A_i \in a_v$ are maximal
cliques at the time that they are computed (see Algorithm 2), but at any later time
there can exist another maximal clique containing $A_i$. As we demonstrate,
computing Equation 1 when Algorithm 2 has finished leads to the maximum clique. We
analyze its cost later.

We first introduce an algorithm of low complexity that can find the maximum
clique in certain cases; this is Algorithm 1.

Theorem 1. Algorithm 1 ends if $G = (V, E)$ has finite size, computing the
attributes $a_v^x$ for all vertices $v \in V$ and their corresponding neighbors,
according to the property in Definition 2.

Proof. Let us start with the winding phase of the recursion, that is, the steps
that select $v$ and build the reduced graph $G'$ are executed until we reach an
empty graph $G'$. At this step, the end of recursion is found because each call to
the find_cliques($G'$) function is done with a reduced set of vertices
$|V'| = |V| - 1$ (and its induced graph), and $G$ has finite size; i.e., the
recursion is done $n$ times.

From there, we analyze the unwinding phase. Let's start when the initialization
steps are executed for the first time: the return of the function will carry a
graph with just the vertex $v$, no edges and $a_v = \emptyset$, that is
$H = (\{v\}, \emptyset, a_v)$. Then, the following instance(s) can add vertices of
degree zero until one instance begins to add the first edges (edge), getting a
star with leaves (one leaf), because the degree is an increasing function (the
winding phase was carried out taking the maximum degree, so the unwinding one
reconnects vertices with the same or a greater degree). At this instance, the loops
over the neighbors $x \in M$ and over the sets of $a_x$ are executed, and the
computation of $L$ yields $L = \{v, x\}$ because $a_x$ is empty (the vertex $x$ has
no registered neighbors until now). The following conditional sentence is true
(the initialization assures $a_v^x = \emptyset$), setting each neighbor as a
clique on both sides, the neighbor vertex $a_x^v = \{v, x\}$ and the local one
$a_v^x = \{v, x\}$ (we consider objects $a_v$ as mutable).

From this point of the algorithm execution any one of the next vertices could
build a $K_s$, $s \in \mathbb{N}$ (e.g., $s = 3$) because a new vertex joining
former vertices constituting a $K_{s-1}$ could appear. It is worth remarking that
it is not possible to build
Algorithm 1: Function $H \leftarrow$ find_cliques($G$)

Input: a graph $G = (V, E)$
Output: a graph $H = (V, E, A)$
begin
    find a vertex $v$ such that $d(v)$ is maximum;
    set $M = N(v)$;
    $G' = (V', E')$ : $V' = V \setminus v$, $E' = E \setminus \{v \times N(v)\}$;
    $a_v^x \leftarrow \emptyset$ for all $x \in M$, or $a_v \leftarrow \emptyset$;
    set $H' \leftarrow \{\emptyset, \emptyset, \emptyset\}$;
    if $G' \neq \emptyset$ then
        set $H' \leftarrow$ find_cliques($G'$);
        for each $x \in M$ do
            for each set $a_x^z$, or $a_x = \emptyset$ do
                if ($a_x^z \neq \emptyset$), then $L \leftarrow (N(v) \cap a_x^z) \cup v \cup x$, else $L \leftarrow v \cup x$;
                if $|L| \geq |a_v^x|$ then
                    $a_v^x \leftarrow L$; $a_x^v \leftarrow L$;
                end
            end
        end
    end
    set $H \leftarrow H' \cup (v, v \times M, a_v)$;
    return $H$;
end



a $K_{s+1}$ at this stage (e.g., 3 + 1 = 4). The reason is that there are only
$K_{s-1}$, and a new vertex just adds edges between itself and the present
vertices; it never adds an edge between the present vertices. In this way, the
size of new cliques is an increasing function (either the maximum clique remains
at the same size or it is increased by one vertex).

Now, we will show that $a_v^x$ is always a clique. We have already shown that the
first elements in $a_v^x$ constitute a clique of two vertices: $v$ and its neighbor
$x$. Consider, at any instance in the unwinding phase of Algorithm 1, a vertex $v$
that has a clique stored for each one of its neighbors $x$ in $a_v^x$. Let's
consider, without loss of generality, a new neighbor of $v$, called $w$, having as
neighbors $B \subset a_v^x = K_t$, that is $N(w) \supseteq B \cup v$. When the
computation of $L$ is executed, either $L = (N(w) \cap a_v^x) \cup w \cup v$ or
$L = \{w, v\}$ (because it is possible that $a_v^x \cap N(w)$ is empty), and then
$L \supseteq B \cup w$, which is also a clique because $B \subset a_v^x$ is a
clique and $w$ is a neighbor of all vertices in $B$ by hypothesis.

To conclude the proof, it is enough to determine if the property in Definition 2
is obtained by Algorithm 1. The initial case was already shown in the second
paragraph of this proof. Then, consider a case where a maximal clique of vertices
$w$ and $y$ is $K_s$, and there exists another clique $K_t$ such that
$w, y \in K_t$ and $t < s$. As seen in a previous paragraph, vertices can only
build cliques that increase the previous one by just one vertex. Let's consider
that the next vertex $z$ is connected with all vertices in $K_t$ and $z$ is
connected with at most $t - 1$ vertices in $K_s$. The minimum difference that is
needed to distinguish between two cliques is one vertex.
Figure 1. Counterexample for Algorithm 1, which gives a clique of
size 4 instead of 5 when vertices are treated in increasing order.



Taking into account, without loss of generality, that the maximum clique for $z$
is $K_t \cup z$, the computation of $L$ and the following update will select the
clique $K_t \cup z$ because at least one neighbor of $z$, let's call it $u$, has
$K_t$ stored for some $i \in N(u)$. At this point, the values stored for $w$, $y$
and $z$ will be updated because their size is greater than or equal to the size of
a clique found before. Thus, if $t + 1 < s$ then $K_s$ remains a maximal clique for
$w$ and $y$; or else $t + 1 > s$ and the set of $t$ vertices in $K_t$ plus $z$ is a
maximal clique for $w$ and $y$. It is worth remarking that it is possible that a
vertex $v$ has several cliques with a neighbor $x$, and the size condition assures
that the maximum clique, among the known cliques, is taken. Notice that $w$ and $y$
still have the other clique $K_s$ stored, but
$K_{t+1} = \max(|a_w^x|, \forall x \in N(w))$ is the maximum clique, among the
known cliques, for $w$ because $t + 1 > s$ and the previous maximum clique was
$K_s$; the same occurs for $y$. □

Before analyzing its complexity, we remark that Algorithm 1 does not guarantee that
the operation $A = \max(|A_i| : A_i \in a_v^x, v \in V, x \in N(v))$ gives the
maximum clique $K_{\max}(G)$. In fact, Figure 1 presents a counterexample. Imagine
that the unwinding phase takes vertices in the order 1, 2, 3, 4, 5, 6, 10, 11, 12,
13, 14. Notice that the list $a_{12}$ does not have the clique {10, 11, 12} because
the cliques stored for the pairs (10, 11), (11, 12) and (10, 12) are
{5, 6, 10, 11}, {3, 4, 11, 12} and {1, 2, 10, 12}, respectively; therefore, when
vertex 13 is added it never finds the clique {10, 11, 12, 13}, so the maximum
clique is not found. Nevertheless, the maximum clique is often found when very few
vertices have degree close to $d_{\max}$ and the graph is sparse, as is the case
for the so-called scale free networks^1.
Time complexity of Algorithm 1. The computation of $L$ in Algorithm 1 is an
intersection of two sets of size $d_{\max}$ (maximum degree), taking
$O(d_{\max})$ if both are ordered^2. Next, we consider the loop over the sets of
$a_x$, which is executed $d_{\max}$ times and determines $O(d_{\max}^2)$. The loop
on the neighbors of vertex $v$ takes an extra $d_{\max}$, giving $O(d_{\max}^3)$.
The search for the vertex $v$ has a complexity of $O(n)$ if vertices are not
ordered, but this is an additive cost expected to be smaller than the cost of the
loops. Finally, the recursion is done for each vertex of the graph, producing a
total time complexity of $O(n \cdot d_{\max}^3)$. In the case when neighbors are
not ordered, we get $O(n \cdot d_{\max}^4)$. Considering a connected graph, we can
express the $n$ recursions and the visit to all neighbors in the loop over
$x \in M$ as visiting all the edges, yielding a time complexity of
$O(m \cdot d_{\max}^2)$.

^1 Graphs having a heavy tailed degree distribution which can be bounded by a power law.
^2 This can be done at the beginning for all vertices in graph $G$, taking $O(n \cdot d_{\max} \cdot \log(d_{\max}))$.

For graphs in general, where $d_{\max}$ could be bounded by $n$, the time
complexity is $O(n^4)$. However, for graphs having a heavy tailed degree
distribution which can be bounded by a power law^3 $P(d) \propto d^{-\beta}$, it
is possible to find a lower bound. Indeed, these graphs have $n^{1/(\beta-1)}$ as
a bound of $d_{\max}$, therefore the complexity yields
$O(n^{(\beta+1)/(\beta-1)})$ for $\beta < 3$; and for $\beta > 3$ either the
search for the vertex $v$ or the elimination of $v$ from the graph dominates,
reaching the bound $O(n^2)$ or $O(n \cdot d_{\max} \cdot \log d_{\max})$
respectively.
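The threshold at $\beta = 3$ can be checked with a short calculation. This is only a sketch under two assumptions that are not spelled out in the text: the graph is sparse, so $m = O(n)$, and the maximum degree follows the usual natural cutoff $d_{\max} \leq n^{1/(\beta-1)}$ of a power-law degree distribution.

```latex
\begin{align*}
m \cdot d_{\max}^{2}
  &= O\!\left(n \cdot n^{2/(\beta-1)}\right)
   = O\!\left(n^{(\beta+1)/(\beta-1)}\right), \\
\frac{\beta+1}{\beta-1} > 2
  &\iff \beta + 1 > 2\beta - 2
   \iff \beta < 3 .
\end{align*}
```

Under these assumptions the clique computation dominates the $O(n^2)$ vertex search only when $\beta < 3$. For example, $\beta = 2.2$ gives an exponent of $3.2 / 1.2 \approx 2.67$.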
Storage complexity. It can be computed as the space to store graph $G$, which
is $O(n \cdot d_{\max})$, and the space occupied by all the sets. This last
quantity can be computed from the length of each set $a_v^x$, which is bounded by
$d_{\max}$, and the number of them per vertex, which is also bounded by
$d_{\max}$, yielding $O(d_{\max}^2)$ per vertex. Thus, the total storage
complexity is $O(n \cdot d_{\max}^2)$.

Now, we present Algorithm 2, which is capable of finding the maximum clique in
any graph.

The following theorem proves the correctness of Algorithm 2, verifying how this
algorithm accords with Definition 2.

Theorem 2. Algorithm 2 ends if $G = (V, E)$ has finite size, computing correctly
the attributes in the list $a_v$ for all vertices $v \in V$, according to
Definition 2.

Proof. Let us start with the winding phase of the recursion, that is, the steps
that select $v$ and build the reduced graph $G'$ are executed until we reach an
empty graph $G'$. At this step, the end of recursion is found because each call to
the find_cliques($G'$) function is done with a reduced set of vertices
$|V'| = |V| - 1$ (and its induced graph), and $G$ has finite size; i.e., the
recursion is done $n$ times.

Next, we analyze if the loop over the sets $B$ and the final append maintain the
list $a_v$ according to Definition 1. Firstly, suppose that a certain set $L$ is a
subset of one or more sets in the list, that is $L \subset S_j \in a_v$. In this
case, for every such $S_j \in a_v$, the condition $|B| < |L|$ is false because the
size of $L$ is necessarily smaller than that of $S_j$, since $L \subset S_j$; then
the 'else' clause is executed. The condition $B \not\supseteq L$ is also false by
the hypothesis $L \subset S_j$, and again the 'else' clause is selected, setting
the variable old to TRUE (notice this variable was initialized as FALSE).
Therefore, the final conditional sentence will be false, because the value of old
is not changed until the initialization is executed again, and $L$ will not be
included in $a_v$. Secondly, suppose that a certain $L$ is not a subset of any
$A_i \in a_v$, and certain $A_j$ are subsets of $L$. When the condition
$|B| < |L|$ is true, new is set to TRUE, and each such set $A_j$ is also
eliminated from list $a_v$ (notice that in the case $a_v = \emptyset$ new is also
set to TRUE because the size of $B$ is zero). Finally, for the case in which $L$
is different from all sets $A_i \in a_v$, new is set to TRUE in the first branch
or in the second one because either $|A_i| < |L|$ and $A_i \neq L$, or
$A_i \not\supseteq L$, respectively. Consequently, it is shown that the algorithm
maintains list $a_v$ verifying Definition 1.

We continue with the algorithm from the end of the winding phase. Let's start
with the unwinding phase when the initialization steps are executed for the first
time: the return of the function will carry a graph with just the vertex $v$, no
edges and $a_v = \emptyset$

^3 Most of the real problems in the Complex Systems field have $2 < \beta < 3$.
Algorithm 2: Function $H \leftarrow$ find_cliques($G$)

Input: a graph $G = (V, E)$
Output: a graph $H = (V, E, A)$
begin
    find a vertex $v$;
    set $M = N(v)$;
    $G' = (V', E')$ : $V' = V \setminus v$, $E' = E \setminus \{v \times N(v)\}$;
    set $H' \leftarrow \{\emptyset, \emptyset, \emptyset\}$;
    if $G' \neq \emptyset$ then set $H' \leftarrow$ find_cliques($G'$);
    for each $x \in M$ do
        for each set $A \in a_x$, or $a_x = \emptyset$ do
            set new $\leftarrow$ FALSE, and old $\leftarrow$ FALSE;
            if $a_x \neq \emptyset$ then $L \leftarrow (N(v) \cap A) \cup v \cup x$ else $L \leftarrow v \cup x$;
            for each set $B \in a_v$, or $a_v = \emptyset$ do
                if $|B| < |L|$ then
                    if $B \neq \emptyset$ and $B \subset L$ then
                        $a_v \leftarrow \{a_v \setminus B\}$;
                    end
                    set new $\leftarrow$ TRUE;
                else
                    if $B \not\supseteq L$ then
                        set new $\leftarrow$ TRUE;
                    else
                        set old $\leftarrow$ TRUE;
                    end
                end
            end
            if new == TRUE & old == FALSE then
                $a_v \leftarrow \{a_v, L\}$;
            end
        end
    end
    set $H \leftarrow H' \cup (v, v \times M, a_v)$;
    return $H$;
end



(see Algorithm 2), that is $H = (\{v\}, \emptyset, a_v)$. Then, the following
instance(s) can add vertices of degree zero until one instance begins to add the
first edges (edge), getting a star with leaves (one leaf). At this instance, the
loops over $x$ and $A$ are executed and the computation of $L$ yields
$L = \{v, x\}$ because $a_x$ is empty (the vertex $x$ has no registered neighbors
until now). The loop over $B$ is executed yielding as result new = TRUE and
old = FALSE because $a_v = \emptyset$, and getting finally $a_v = \{\{v, x\}\}$ in
the append step. Until now, we have shown the initial phase, where the first
cliques of size 2 are stored.
We continue to treat the case at any instance in the unwinding phase of
Algorithm 2.

To verify property (i) of Definition 2 we need to show that each set
$A_i \in a_v$ is a clique. As we have shown, the initialization phase always
leaves sets composed of a vertex $w$ and one of its neighbors $z \in N(w)$, which
are cliques of size 2. A new joining vertex $v$, neighbor of $w$, will compute $L$
with $a_w \neq \emptyset$ as $L \leftarrow (N(v) \cap A) \cup v \cup w$, where $A$
can be $\{w, z\}$. Thus, if $z$ is also a neighbor of $v$, the result will be
$L = \{v, w, z\}$, which is also a clique. In general, if the set $A$ is a clique,
then the resulting $L$ is also a clique because we take the intersection of $A$
with the neighbors of the currently connected vertex $v$. Then, all the sets
$A_i \in a_v$ are always cliques; and as we have already shown that they verify
Definition 1, they are also maximal.

To conclude the proof, it is enough to determine if property (ii) of Definition 2
is obtained by Algorithm 2. The initial phase was already shown some paragraphs
before; the loop over $B$ and the append step assure that a new clique $L$ of the
joining vertex $v$ is stored when it is different from all the previous ones, or
when any of the previous $A_i \in a_v$ are a subset of $L$. Then, when the vertex
$v$ is reconnected to the graph, all the maximal cliques are computed and stored,
and among them the maximum clique that this vertex $v$ belongs to at this time of
the algorithm; we call it $K_{\max,j}(v)$. Moreover, it is possible that at a
later time of the execution of the algorithm, another vertex $w$ finds a greater
clique containing the last seen $K_{\max,j}(v)$, but this clique will always be
found because it will be stored in the last reconnected vertex belonging to this
clique. Therefore, reading all the $A_i \in a_v$ for all vertices, the maximum
clique is obtained. □
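The unwinding phase described in the proof can be sketched in Python as follows. This is an illustrative reading of Algorithm 2, not the implementation cited in the paper: the names `insert_maximal` and `find_cliques`, the adjacency-dictionary input and the optional `order` argument are assumptions, and the recursion is replaced by a loop that restores the vertices in reverse removal order.

```python
def insert_maximal(a_v, L):
    """Store clique L in list a_v, following the new/old flags of Algorithm 2."""
    new, old = False, False
    # "for each set B in a_v, or a_v = empty": one pass with B = {} if empty.
    for B in list(a_v) or [frozenset()]:
        if len(B) < len(L):
            if B and B < L:      # B is a non-empty proper subset of L
                a_v.remove(B)
            new = True
        elif not B >= L:         # B does not contain L
            new = True
        else:                    # L is already covered by B
            old = True
    if new and not old:
        a_v.append(L)


def find_cliques(adj, order=None):
    """Return {v: a_v}, where a_v lists the cliques stored when v is restored.

    adj:   dict mapping each vertex to the set of its neighbours
           (simple undirected graph).
    order: removal order of the winding phase (any order is allowed).
    """
    order = list(adj) if order is None else list(order)
    restored = set()
    attrs = {}
    for v in reversed(order):            # unwinding: last removed comes first
        neighbours = adj[v] & restored   # N(v) in the graph rebuilt so far
        a_v = []
        for x in neighbours:
            for A in attrs[x] or [frozenset()]:
                L = frozenset((neighbours & A) | {v, x})
                insert_maximal(a_v, L)
        attrs[v] = a_v
        restored.add(v)
    return attrs
```

On a triangle, `find_cliques({1: {2, 3}, 2: {1, 3}, 3: {1, 2}})` stores the set {1, 2, 3} in the list of the last restored vertex, and the largest stored set over all lists is the maximum clique, as stated by property (ii).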

Time complexity of Algorithm 2. We first analyze the cost of the central loop
over the sets $B$. Considering that the size of list $a_v$ can be bounded by the
function $S(d_{\max})$, and that the set operations inside the loop are bounded by
the maximum clique size in $v$, that is $O(d_{\max})$, this loop is done in
$O(d_{\max} \cdot S(d_{\max}))$.

Then, the computation of $L$ in Algorithm 2 is an intersection of two sets of size
$d_{\max}$ (maximum degree), taking $O(d_{\max})$ if both are ordered^4. Next, we
consider the loop over the sets $A$, which is executed $S(d_{\max})$ times because
$|a_x| \leq S(d_{\max})$, giving $O(d_{\max} \cdot S^2(d_{\max}))$. The loop on
the neighbors of vertex $v$ takes an extra $d_{\max}$ times, giving
$O(d_{\max}^2 \cdot S^2(d_{\max}))$. The selection of the vertex $v$ has a
complexity of $O(1)$. Finally, the recursion is done for each vertex of the graph,
producing a total time complexity of $O(n \cdot d_{\max}^2 \cdot S^2(d_{\max}))$.
Considering graphs with $m \geq n$, we can express the $n$ recursions and the visit
to all neighbors in the loop over $x \in M$ as visiting all the edges, yielding a
time complexity of $O(m \cdot d_{\max} \cdot S^2(d_{\max}))$.

For graphs in general, where $d_{\max}$ could be bounded by $n$, the time
complexity is $O(n^3 \cdot S^2(n))$. Now, the main problem is to bound the
function $S(d_{\max})$. In general, this function can be exponential, but for some
families of graphs it can be polynomial. This last case is observed in most of the
graphs coming from the Complex Systems field, which have a heavy tailed degree
distribution that can be bounded by a power law. These graphs also have the
property that $m \ll n^2$, that is, they are sparse. As the number of cliques is
highly related to the number of triangles in the graph, and this is low because
the graph is sparse, the number of cliques per vertex is also low. For these
cases, we can model this function as $S(d_{\max}) = d_{\max}^{\alpha}$ with
$\alpha \in \mathbb{N}$. Then, knowing that $n^{1/(\beta-1)}$ is a bound of
$d_{\max}$, the complexity is polynomial in $n$ for $\beta < 2 + \alpha$; and for
$\beta > 2 + \alpha$ the elimination of $v$ from the graph can dominate, reaching
the bound $O(n^2)$ or $O(n \cdot d_{\max} \cdot \log d_{\max})$.

^4 This can be done at the beginning for all vertices in graph $G$, taking $O(n \cdot d_{\max} \cdot \log(d_{\max}))$.

Storage complexity. It can be computed as the space to store graph $G$, which is
$O(n \cdot d_{\max})$, and the space occupied by all the lists of sets. This last
quantity can be computed from the length of each list $a_v$, which is bounded by
$S(d_{\max})$. Thus, counting $d_{\max}$ neighbors, the total storage complexity
is $O(n \cdot d_{\max} \cdot \max(d_{\max}, S(d_{\max})))$.

3. Applications 

In this section we illustrate Algorithms 1 and 2 through an implementation
developed in the Python programming language [Alvarez-Hamelin, 2011], and its
application to some graphs, showing their maximum clique.

Firstly, we apply our algorithm to find cliques in some random graphs defined
by [Erdos and Renyi, 1959]. It is shown in [Bollobas, 2001] [Bollobas and Erdos, 1976]
that an ER random graph has a high probability of containing a clique of size

(2) $r = \frac{2 \log n}{\log(1/p)}$,

where $n$ is the number of vertices and $p$ is the probability that an edge exists
between any pair of vertices.
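Equation 2 can be evaluated directly for given values of n and p. The helper below is a small sketch; the name `expected_clique_size` is hypothetical and the ratio does not depend on the base of the logarithm.

```python
import math


def expected_clique_size(n, p):
    """Clique size r of Equation 2 for an ER graph with n vertices and edge probability p."""
    return 2 * math.log(n) / math.log(1 / p)


# Evaluate the formula for a few (n, p) pairs.
for n, p in [(100, 0.01), (1000, 0.01), (10000, 0.01), (10000, 0.04642)]:
    print(n, p, round(expected_clique_size(n, p), 2))
```

Since r is in general not an integer, it has to be rounded before it is compared with the size of a clique found in a graph.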



    n        p       d    r   $|K_{\max}|$   $|\mathcal{K}| : \mathcal{K} = \{K \in K_{\max}\}$   induced r + 1
  100     0.01       1    2        2                          60                                      yes
 1000     0.01      10    3        3                         159                                      yes
10000     0.01     100    4        4                         372                                      yes
10000  0.04642   464.2    6        6                           5                                      yes

Table 1. Maximum cliques in Erdos Renyi graphs.



Table 1 shows the results, where the columns are: the size of the graph, the
probability p, the average degree d, the computed r according to Equation 2, the
size found by Equation 1, the number of different cliques (i.e., at least one
vertex is different), and whether an induced clique of size r + 1 was found. The
last column is obtained by adding new edges to build a clique greater than the
maximum; it shows 'yes' when this clique is found and 'not' if it is not found.
Moreover, the 'yes' answer also means that we find just one clique of that size
(compare with the number of cliques of size r found in the original graph). We
tested the algorithm on several graphs of each kind, obtaining the same results
(excluding $|\mathcal{K}|$, which changed sometimes). We display just one result
of each kind.

The result is evident: we always find the predicted maximum clique, even when
an artificial one is introduced.

Secondly, we applied our algorithm to an AS Internet graph. This graph has, as
main properties, a power law degree distribution and most vertices of low degree
connected to the high degree ones. We used an exploration of [CAIDA, 1998]
performed in September 2011. Figure 2 shows a visualization of this map obtained
by LaNet-vi [Beiro et al., 2008]. This visualization is based on k-core
decomposition. A k-core is the maximum induced subgraph such that all vertices
have degree at least k [Seidman, 1983] [Bollobas, 1984]. LaNet-vi paints each
vertex with the rainbow colors according to its shell index, i.e., the maximum
core that a vertex belongs to. It also makes a greedy clique decomposition of the
top core, i.e., the core with maximum k, placing each clique in a circular sector
according to its size.
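The shell index mentioned above can be computed with the usual peeling procedure, which repeatedly removes a vertex of minimum remaining degree. The sketch below is a simple quadratic-time illustration and is not the code used by LaNet-vi; the name `shell_index` and the adjacency-dictionary input are assumptions.

```python
def shell_index(adj):
    """Return {v: k}, where k is the largest k-core that vertex v belongs to.

    adj: dict mapping each vertex to the set of its neighbours.
    """
    degree = {v: len(nb) for v, nb in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        # The core level never decreases during the peeling.
        k = max(k, degree[v])
        shell[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return shell
```

For a triangle with one pendant vertex attached, the pendant vertex gets shell index 1 and the three triangle vertices get shell index 2, so the triangle is the top core.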
In this graph Algorithm 2 found a $K_{29}$ while LaNet-vi found a $K_{24}$; this
is displayed as the largest circular sector of red vertices in Figure 2. Moreover,
this figure shows the vertices of the $K_{29}$ as those enumerated from 01 to 29.
It is possible to appreciate that, even if all vertices are in the top core, the
heuristic of LaNet-vi does not find this clique. For cases where some vertex is
not in the top core, LaNet-vi never finds the maximum clique. Algorithm 1 runs
several times faster than Algorithm 2, but it does not always find the maximum
clique. For instance, for some starting vertices, Algorithm 1 found a $K_{27}$
instead of $K_{29}$.

Comparing the execution time on this graph and on an ER graph of the same size,
e.g., the same number of edges, we find that the AS graph ends quicker than the ER
graph, since its degree distribution follows a power law with $\beta \approx 2.2$.

4. Discussion 

As we have already remarked, this problem is NP-Complete, and this can be seen
from the complexity of Algorithm 2, which depends on the number of cliques that
a vertex belongs to, or from the fact that Algorithm 1 does not always find the
maximum clique.




Figure 2. Visualization of AS Internet map by LaNet-vi. Vertices
in the maximum clique are labeled with numbers.

However, this paper aims to introduce a new approach to find cliques, which seems
to be faster than the classical algorithms.